AITopics | expectation suite

expectation suite

How to Test PySpark ETL Data Pipeline

#artificialintelligence Dec-6-2022, 15:40:13 GMT

"Garbage in, garbage out" is a common expression used to emphasize the importance of data quality for tasks such as machine learning, data analytics, and business intelligence. With an increasing amount of data being created and stored, building high-quality data pipelines has never been more challenging. PySpark is a commonly used tool for building ETL pipelines for large datasets. A common question that arises while building a data pipeline is "How do we know that our data pipeline is transforming the data in the way that is intended?" To answer this question, we borrow the idea of unit testing from software development.
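
The unit-testing idea can be sketched roughly as follows. This is only an illustration, not the article's own code: it uses plain Python rows (lists of dicts) as a stand-in for a PySpark DataFrame so that it runs without a Spark session, and the `add_total_price` transform, its column names, and the sample orders are all hypothetical.

```python
import unittest


def add_total_price(rows):
    """Hypothetical transform step: add total_price = quantity * unit_price.

    Rows are plain dicts here; in a real pipeline this would be a
    DataFrame-to-DataFrame function.
    """
    return [
        {**row, "total_price": row["quantity"] * row["unit_price"]}
        for row in rows
    ]


class AddTotalPriceTest(unittest.TestCase):
    def test_total_price_is_computed(self):
        # Small hand-written input whose correct output is known in advance.
        source = [
            {"order_id": 1, "quantity": 2, "unit_price": 5.0},
            {"order_id": 2, "quantity": 1, "unit_price": 3.5},
        ]
        expected = [
            {"order_id": 1, "quantity": 2, "unit_price": 5.0, "total_price": 10.0},
            {"order_id": 2, "quantity": 1, "unit_price": 3.5, "total_price": 3.5},
        ]
        self.assertEqual(add_total_price(source), expected)

    def test_input_rows_are_not_mutated(self):
        # The transform should return new rows and leave its input untouched.
        source = [{"order_id": 1, "quantity": 2, "unit_price": 5.0}]
        add_total_price(source)
        self.assertNotIn("total_price", source[0])
```

Saved to a file, the tests can be run with `python -m unittest <file>`. The same pattern (a tiny known input, the transform under test, an exact expected output) should carry over to a PySpark transform, with the comparison done on DataFrames instead of lists.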

data pipeline, expectation suite, pipeline, (11 more...)

Technology:

Information Technology > Data Science > Data Integration (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.61)
Information Technology > Data Science > Data Quality (0.56)

Reducing Pipeline Debt With Great Expectations

#artificialintelligence Apr-29-2022, 15:15:17 GMT

This article was first published on Neptune AI's blog. You are a part of a data science team at a product company. Your team has a number of machine learning models in place. Their outputs guide critical business decisions and feed a couple of dashboards displaying important KPIs that are closely watched by your executives day and night. On that fatal day, you had just brewed yourself a cup of coffee and were about to begin your workday when the universe collapsed. Everyone at the company went crazy. The business metrics dashboard was displaying what seemed to be random numbers (except on every full hour, when the KPIs looked okay for a short time), and the models were predicting that the company's insolvency was looming fast. What is worse, every attempt to resolve this madness resulted in your data engineering and research teams reporting new broken services and models. That was the debt collection day, and the unpaid debt was of the worst kind: pipeline debt.

data pipeline, expectation suite, great expectation, (16 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.91)
